Project Topic

This project will examine consumer shopping trends and purchase behaviors using the Customer Shopping (Latest Trends) Dataset. The analysis will focus on uncovering patterns in retail purchasing across various product categories, customer demographics, and purchase channels.

Data Sources

Description of the Data

The dataset contains transactional records spanning product categories, customer demographics, and purchase channels. Key features used in this analysis include customer age and gender, product category, purchase amount (USD), review rating, season, payment method, and number of previous purchases.

Ideas about the figures that you will create to visualize this data:

Based on the available data, here are three proposed figures that could provide valuable insights into consumer shopping trends:

  1. Bar Chart: Distribution of purchases across different product categories, highlighting the most popular categories.

  2. Stacked Bar Chart: Breakdown of purchase amounts by gender and age group, showing spending patterns across demographics.

  3. Line Plot: Seasonal trends in purchase amounts over time, revealing peak shopping periods and potential cyclical patterns.

Import Your Data

In the following code chunk, import your data.
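The import code itself is not shown here. One possible sketch, assuming the dataset was downloaded as a CSV file: the file name `shopping_trends.csv` and the helper `load_shopping_trends()` are hypothetical, and the demonstration rows are made up. Setting `check.names = FALSE` keeps headers such as `Purchase Amount (USD)` unchanged, which the backtick-quoted column references in the figures rely on.

```r
# Hypothetical helper: read the dataset from a CSV file.
# check.names = FALSE preserves headers that contain spaces and parentheses.
load_shopping_trends <- function(path) {
  stopifnot(file.exists(path))
  read.csv(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Self-contained demonstration on a tiny made-up file (not real data)
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Age,Gender,Category,Purchase Amount (USD),Review Rating",
  "25,Male,Clothing,53,3.1",
  "41,Female,Footwear,64,4.2"
), demo_path)

demo <- load_shopping_trends(demo_path)
names(demo)

# With the real file (assumed name):
# shopping_trends <- load_shopping_trends("shopping_trends.csv")
```

If the tidyverse is preferred, `readr::read_csv()` also keeps the original column names by default.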

Figure 1

Scatter Plot: Relationship between review ratings and purchase amounts

library(ggplot2)

# Scatter Plot: one point per purchase
ggplot(shopping_trends, aes(x = `Review Rating`, y = `Purchase Amount (USD)`)) +
  geom_point() +
  theme_minimal() +
  labs(title = "Relationship Between Review Ratings and Purchase Amounts",
       x = "Review Rating",
       y = "Purchase Amount (USD)")

Figure 2

Box Plot: Comparison of purchase amounts across three product categories: Clothing, Footwear, and Accessories

library(ggplot2)
library(dplyr)

# Filter the data for the selected categories
selected_categories <- shopping_trends %>% filter(Category %in% c("Clothing", "Footwear", "Accessories"))

# Boxplot
ggplot(selected_categories, aes(x = Category, y = `Purchase Amount (USD)`, fill = Category)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "Boxplot of Purchase Amounts Across Product Categories",
       x = "Product Category",
       y = "Purchase Amount (USD)")

Figure 3

Heatmap: Correlation between customer age, number of previous purchases, and purchase amount

library(ggplot2)
library(dplyr)
library(tidyr)
library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
# Calculate the correlation matrix
cor_matrix <- cor(shopping_trends %>% select(Age, `Previous Purchases`, `Purchase Amount (USD)`))

# Heatmap
heatmap_data <- melt(cor_matrix)

ggplot(heatmap_data, aes(Var1, Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white", midpoint = 0) +
  theme_minimal() +
  labs(title = "Correlation Heatmap",
       x = "Variable",
       y = "Variable",
       fill = "Correlation")
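The reshaping step can also be done without `reshape2`. The following is a minimal sketch using only base R, run on made-up numbers that stand in for the three dataset columns; it produces the same long format (`Var1`, `Var2`, `value`) that the heatmap expects.

```r
# Made-up numeric data standing in for Age, Previous Purchases,
# and Purchase Amount (USD); these values are not from the dataset
set.seed(1)
demo <- data.frame(
  Age = sample(18:70, 50, replace = TRUE),
  `Previous Purchases` = sample(1:50, 50, replace = TRUE),
  `Purchase Amount (USD)` = sample(20:100, 50, replace = TRUE),
  check.names = FALSE
)

# Pairwise Pearson correlations between the three columns
cor_matrix <- cor(demo)

# as.table() followed by as.data.frame() yields one row per matrix cell
heatmap_data <- as.data.frame(as.table(cor_matrix),
                              responseName = "value",
                              stringsAsFactors = FALSE)
names(heatmap_data)[1:2] <- c("Var1", "Var2")

head(heatmap_data)
```

A 3 x 3 correlation matrix becomes nine rows, and the rows where `Var1` equals `Var2` hold the diagonal of ones.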

Figure 4

Bar Chart: Distribution of purchases across product categories

library(ggplot2)
library(dplyr)

# Bar Chart
ggplot(shopping_trends, aes(x = Category)) +
  geom_bar(fill = "skyblue") +
  theme_minimal() +
  labs(title = "Distribution of Purchases Across Product Categories",
       x = "Product Category",
       y = "Count")

Figure 5

Pie Chart: Distribution of payment methods used

# Pie Chart
# Count purchases per payment method
payment_method_counts <- shopping_trends %>%
  count(`Payment Method`)

# A pie chart is a single stacked bar drawn in polar coordinates
ggplot(payment_method_counts, aes(x = "", y = n, fill = `Payment Method`)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  theme_void() +
  labs(title = "Distribution of Payment Methods")
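Slices of similar size are hard to compare by eye, so it can help to report each payment method's share as a percentage alongside the chart. A minimal sketch with made-up values (the method names and counts below are illustrative, not taken from the dataset):

```r
# Made-up payment methods for illustration only
demo_methods <- c("Cash", "PayPal", "Cash", "Credit Card",
                  "Cash", "PayPal", "Venmo", "Cash")

# Count each method, then convert counts to percentages of the total
counts <- table(demo_methods)
shares <- round(100 * prop.table(counts), 1)

shares
```

The same percentages could be attached to the plot as labels or shown in a small table next to it.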

Figure 6

Line Plot: Seasonal trends in purchase amounts over time

# Line Plot
ggplot(shopping_trends, aes(x = Season, y = `Purchase Amount (USD)`, group = 1)) +
  geom_line(stat = "summary", fun = "mean") +
  theme_minimal() +
  labs(title = "Seasonal Trends in Purchase Amounts",
       x = "Season",
       y = "Average Purchase Amount (USD)")
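Because `Season` is categorical, ggplot2 orders its values alphabetically unless told otherwise, so the line may not follow calendar order. A minimal sketch of fixing the order with a factor, assuming the column holds the four labels "Winter", "Spring", "Summer", and "Fall" (an assumption about the dataset; the rows below are made up):

```r
# Assumed season labels; adjust if the dataset spells them differently
season_levels <- c("Winter", "Spring", "Summer", "Fall")

# Made-up rows for illustration only
demo <- data.frame(
  Season = c("Fall", "Spring", "Winter", "Summer", "Fall", "Winter"),
  amount = c(60, 55, 70, 50, 64, 66)
)

# Fixing the factor levels makes plots and summaries follow calendar order
demo$Season <- factor(demo$Season, levels = season_levels)

# Mean amount per season, returned in level order
seasonal_means <- tapply(demo$amount, demo$Season, mean)
seasonal_means
```

Applying the same `factor()` call to `shopping_trends$Season` before plotting would put the x-axis in that order. Any value not listed in `season_levels` becomes `NA`, so it is worth checking `unique(shopping_trends$Season)` first.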

Figure 7

Stacked Bar Chart: Breakdown of purchase amounts by gender and age group

# Bin ages into groups; intervals are right-closed, so age 60 falls in "41-60"
shopping_trends <- shopping_trends %>%
  mutate(AgeGroup = cut(Age, breaks = c(0, 20, 40, 60, Inf), labels = c("0-20", "21-40", "41-60", "61+")))

# Stacked Bar Chart: geom_col() stacks the individual purchase amounts,
# so each bar's height is the total spent by that age group
ggplot(shopping_trends, aes(x = AgeGroup, y = `Purchase Amount (USD)`, fill = Gender)) +
  geom_col() +
  theme_minimal() +
  labs(title = "Breakdown of Purchase Amounts by Gender and Age Group",
       x = "Age Group",
       y = "Total Purchase Amount (USD)")
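Plotting raw rows draws one bar segment per purchase. Summarising first gives one segment per gender and age group and makes the totals available as numbers. A minimal sketch on made-up rows (the values are illustrative; the column name `total_usd` is hypothetical):

```r
library(dplyr)

# Made-up rows for illustration only
demo <- data.frame(
  Age = c(19, 25, 33, 47, 52, 68),
  Gender = c("Male", "Female", "Female", "Male", "Male", "Female"),
  `Purchase Amount (USD)` = c(40, 55, 30, 80, 20, 65),
  check.names = FALSE
)

# Bin ages (right-closed intervals), then total the spend per group
totals <- demo %>%
  mutate(AgeGroup = cut(Age, breaks = c(0, 20, 40, 60, Inf),
                        labels = c("0-20", "21-40", "41-60", "61+"))) %>%
  group_by(AgeGroup, Gender) %>%
  summarise(total_usd = sum(`Purchase Amount (USD)`), .groups = "drop")

totals
```

The summarised table can be passed to `ggplot()` with `geom_col()` in place of the raw data.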

Figure 8

Density Plot: Distribution of purchase amounts by gender

# Density Plot
ggplot(shopping_trends, aes(x = `Purchase Amount (USD)`, fill = Gender)) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  labs(title = "Distribution of Purchase Amounts by Gender",
       x = "Purchase Amount (USD)",
       y = "Density")

Figure 9

Interactive Scatter Plot: Relationship between review ratings and purchase amounts

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
# Interactive Scatter Plot
scatter_plot <- plot_ly(shopping_trends, x = ~`Review Rating`, y = ~`Purchase Amount (USD)`,
                        type = 'scatter', mode = 'markers',
                        marker = list(color = 'rgba(152, 0, 0, .8)', size = 10)) %>%
  layout(title = "Relationship Between Review Ratings and Purchase Amounts",
         xaxis = list(title = "Review Rating"),
         yaxis = list(title = "Purchase Amount (USD)"))

scatter_plot